Regularized Tensor Factorizations and Higher-Order Principal Components Analysis
Abstract
High-dimensional tensors, or multi-way data, are becoming prevalent in areas such as biomedical imaging, chemometrics, networking, and bibliometrics. Traditional approaches to finding lower-dimensional representations of tensor data include flattening the data and applying matrix factorizations such as principal components analysis (PCA), or employing tensor decompositions such as the CANDECOMP/PARAFAC (CP) and Tucker decompositions. The former can lose important structure in the data, while the latter Higher-Order PCA (HOPCA) methods can be problematic in high dimensions with many irrelevant features. We introduce frameworks for sparse tensor factorizations, or Sparse HOPCA, based on heuristic algorithmic approaches and on solving penalized optimization problems related to the CP decomposition. Extensions of these approaches lead to methods for general regularized tensor factorizations, multi-way Functional HOPCA, and generalizations of HOPCA for structured data. We illustrate the utility of our methods for dimension reduction, feature selection, and signal recovery on simulated data, multi-dimensional microarrays, and functional MRIs.
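To make the penalized-CP idea concrete, the sketch below alternates soft-thresholded power-method updates over the three modes of a rank-one factorization. It is an illustrative analogue of the kind of L1-penalized CP problem the abstract describes, not the authors' exact algorithm; the function name, penalty parameters, and update scheme are assumptions.

```python
import numpy as np

def sparse_rank1_cp(X, lam=(0.1, 0.1, 0.1), n_iter=100, seed=0):
    """Illustrative sparse rank-one CP factorization of a 3-way array X.

    Alternates soft-thresholded power-method updates over the three modes;
    an assumed analogue of the penalized CP problems mentioned in the
    abstract, not the authors' exact algorithm.
    """
    soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    # Random initialization of the three factor vectors, normalized to unit length.
    rng = np.random.default_rng(seed)
    u, v, w = (rng.standard_normal(n) for n in X.shape)
    u, v, w = (x / np.linalg.norm(x) for x in (u, v, w))

    for _ in range(n_iter):
        # Mode-1 update: contract X with v and w, soft-threshold, renormalize.
        u = soft(np.einsum('ijk,j,k->i', X, v, w), lam[0])
        if np.linalg.norm(u) > 0:
            u /= np.linalg.norm(u)
        # Mode-2 update.
        v = soft(np.einsum('ijk,i,k->j', X, u, w), lam[1])
        if np.linalg.norm(v) > 0:
            v /= np.linalg.norm(v)
        # Mode-3 update.
        w = soft(np.einsum('ijk,i,j->k', X, u, v), lam[2])
        if np.linalg.norm(w) > 0:
            w /= np.linalg.norm(w)

    d = np.einsum('ijk,i,j,k->', X, u, v, w)  # scale of the recovered rank-1 component
    return d, u, v, w
```

Larger penalty values drive more entries of the factor vectors to exact zero, which is what yields the feature selection discussed in the abstract.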
Similar Articles
Sparse Higher-Order Principal Components Analysis
Traditional tensor decompositions such as the CANDECOMP/PARAFAC (CP) and Tucker decompositions yield higher-order principal components that have been used to understand tensor data in areas such as neuroimaging, microscopy, chemometrics, and remote sensing. Sparsity in high-dimensional matrix factorizations and principal components has been well studied and exhibits many benefits; less attention...
Survey on Probabilistic Models of Low-Rank Matrix Factorizations
Low-rank matrix factorizations such as Principal Component Analysis (PCA), Singular Value Decomposition (SVD) and Non-negative Matrix Factorization (NMF) are a large class of methods for pursuing the low-rank approximation of a given data matrix. The conventional factorization models are based on the assumption that the data matrices are contaminated stochastically by some type of noise. Thus t...
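For reference, the deterministic low-rank approximation that PCA and the SVD provide can be written in a few lines. The helper below (a hypothetical name, not from the cited survey) uses the truncated SVD, which by the Eckart-Young theorem gives the best rank-k approximation in the least-squares sense.

```python
import numpy as np

def best_rank_k(X, k):
    """Best rank-k approximation of X in the least-squares sense,
    obtained from the truncated SVD (Eckart-Young theorem)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# A noisy low-rank matrix is recovered well by its rank-2 truncation.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 30))
noisy = low_rank + 0.1 * rng.standard_normal(low_rank.shape)
relative_error = np.linalg.norm(best_rank_k(noisy, 2) - low_rank) / np.linalg.norm(low_rank)
print(relative_error)  # small: the truncation suppresses most of the added noise
```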
Principal Cumulant Component Analysis
Multivariate Gaussian data is completely characterized by its mean and covariance, yet modern non-Gaussian data makes higher-order statistics such as cumulants inevitable. For univariate data, the third and fourth scalar-valued cumulants are relatively well-studied as skewness and kurtosis. For multivariate data, these cumulants are tensor-valued, higher-order analogs of the covariance matrix c...
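As a concrete illustration of a tensor-valued cumulant (not code from the cited paper), the snippet below forms the empirical third-order cumulant tensor of mean-centered data; its superdiagonal holds each feature's unnormalized skewness, the higher-order analogue of the variances on a covariance matrix's diagonal.

```python
import numpy as np

def third_cumulant(X):
    """Empirical third-order cumulant tensor of the columns (features) of X.

    For mean-centered data the third cumulant equals the third central
    moment, kappa[i, j, k] = E[x_i x_j x_k]: a tensor-valued, higher-order
    analogue of the covariance matrix.
    """
    Xc = X - X.mean(axis=0)            # center each feature
    return np.einsum('ti,tj,tk->ijk', Xc, Xc, Xc) / Xc.shape[0]

rng = np.random.default_rng(0)
K3 = third_cumulant(rng.exponential(size=(10000, 3)))  # skewed data
print(np.round(np.einsum('iii->i', K3), 2))            # clearly nonzero superdiagonal
```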
Regularized Iterative Reconstruction in Tensor Tomography Using Gradient Constraints
This paper investigates the iterative reconstruction of tensor fields in diffusion tensor magnetic resonance imaging (MRI). Gradient constraints on the eigenvalue and tensor-component images of the diffusion tensor were exploited. A computer-generated phantom was used to simulate the diffusion tensor in a cardiac MRI study with a diffusion model that depends on the fiber structure of t...
Algorithms for Nonnegative Tensor Factorization
Nonnegative Matrix Factorization (NMF) is an efficient technique to approximate a large matrix containing only nonnegative elements as a product of two nonnegative matrices of significantly smaller size. The guaranteed nonnegativity of the factors is a distinctive property that other widely used matrix factorization methods do not have. Matrices can also be seen as second-order tensors. For som...
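For orientation, here is a minimal sketch of NMF using the standard Lee-Seung multiplicative updates; the function name, iteration count, and initialization are illustrative assumptions, and this is not the specific algorithm of the cited paper.

```python
import numpy as np

def nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Illustrative NMF via Lee-Seung multiplicative updates: approximate a
    nonnegative m x n matrix X as W @ H with W (m x k) >= 0 and H (k x n) >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update the coefficient matrix
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update the basis matrix
    return W, H
```

Because the updates are multiplicative and start from nonnegative values, W and H remain elementwise nonnegative throughout, which is the distinctive property described above.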